33 - Demo 2 - State-of-Art Methods (Vanessa Wirth) [ID:35645]
50 von 517 angezeigt

So good morning everyone.

This is the second demo session of today's computer vision lecture.

As you might remember the last time we did traditional methods and now we are focusing

on especially learning based methods.

And I also think that the advertisement that you heard before is pretty much also maybe

a hint how important computer vision is nowadays.

And in this lecture you will also see maybe possible methods how you could solve the problems

that were mentioned before in the advertisement.

So this session is split into two parts.

First of all we are examining 2D scene understanding or in general 2D computer vision how we can

sort of gain knowledge from 2D data.

And in the next section of this lecture we will focus more of how we can actually gain

a 3D understanding just on the basis of 2D data.

I will first start with a little introduction just for the ones who did never see anything

in connection with deep learning methods.

And I will give a brief introduction what common tasks are in the field of 2D computer

vision.

In general those are the pretty much state of the art tasks that we have in 2D computer

vision.

You might have heard something about semantic segmentation where the goal is to segment

the scene that you're seeing just on the basis of a given image.

You might heard already about classification where the goal is just when you have one single

object in the image to classify what category this object belongs to.

And you also might have already heard about a combination of classification with localization

still with a focus on one single object because you classify the object and later you also

try to estimate the region where this object lies in as a form of 2D bounding box for example.

And now we're getting to the more interesting parts because this is getting a little bit

more complicated as soon as we have multiple objects.

Because then we have also the need to distinguish those objects from each other.

And we start with object detection where the goal is basically the same as in the single

object case.

We want to classify the objects we are seeing and we also want to estimate the region where

they are located.

And the combination with multiple object is called object detection.

And this is a pretty much already well searched but still pretty interesting research direction.

A lot of people are working on that improving learning methods to do object detection.

And in this example we have multiple animals that needed to be detected.

And the network or rather the learning approach that you're using is then also outputting

the class of those objects it is seeing.

And lastly we have something that is called instant segmentation.

Why is it called instant segmentation and not just semantic segmentation as we heard

before?

The crucial part about instant segmentation is now we have if we are seeing multiple objects

there could be the case that we see multiple instances of the same category.

So we can see for example multiple cats.

So we have to distinguish those cats in some sort of each other.

So the network is actually assigning them instances and they are treated as completely

different objects during the learning process.

And as we already saw before those objects once will get a label.

So we have those two cats here.

Teil einer Videoserie :

Presenters

Zugänglich über

Offener Zugang

Dauer

00:49:08 Min

Aufnahmedatum

2021-07-12

Hochgeladen am

2021-07-12 12:57:11

Sprache

en-US

In this session, Vanessa Wirth shows live demos "Learning-based methods" for Object Detection and Semantic Instance Segmentation. She also presents the CAD model alignment in indoor scenes, which also uses Bundle Fusion (Dai et. al 2017: https://dl.acm.org/doi/abs/10.1145/3072959.3054739), which was developed at FAU.

Vanessa is currently pursuing her PhD at the Chair of Visual Computing, Department of Computer Science. She has open thesis/projects related to 3D reconstruction of objects and human bodies: https://www.lgdv.tf.fau.de/person/vanessa-wirth/ 

Tags

Computer Vision
Einbetten
Wordpress FAU Plugin
iFrame
Teilen